Conversation
| DatasetConfiguration: Configuration with airt_harms dataset. | ||
| """ | ||
| return DatasetConfiguration(dataset_names=["airt_harms"], max_dataset_size=4) | ||
| return DatasetConfiguration(dataset_names=["airt_harms"]) |
Any reason for removing the max dataset size? I think we have this set so that our integration tests don't run the entire dataset by default, which would slow them down.
Is there anywhere the user provides a max prompt number that we could pass through to here if it's set, and otherwise keep the default of 4?
I tried making it a user-provided parameter, but the method implicitly belongs to the Scenario superclass since it's called in initialize_async and only exposed as self._dataset_config per instance. I think this is a good feature for a future scenario refactor but is out of scope here, so I put it back to 4 for simplicity's sake and for the integration tests.
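For reference, the pass-through idea discussed above could look roughly like the sketch below. This is only an illustration under assumptions: `DatasetConfigurationSketch` is a hypothetical stand-in for the real `DatasetConfiguration` class (only the `dataset_names` and `max_dataset_size` arguments appear in the diff), and `default_dataset_config` and `DEFAULT_MAX_DATASET_SIZE` are made-up names, not part of the PR.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical constant mirroring the hard-coded value of 4 in the diff.
DEFAULT_MAX_DATASET_SIZE = 4


@dataclass
class DatasetConfigurationSketch:
    """Hypothetical stand-in for DatasetConfiguration, for illustration only."""

    dataset_names: List[str]
    max_dataset_size: Optional[int] = None


def default_dataset_config(
    max_dataset_size: Optional[int] = None,
) -> DatasetConfigurationSketch:
    """Use the caller's limit if given, otherwise fall back to the default of 4."""
    size = DEFAULT_MAX_DATASET_SIZE if max_dataset_size is None else max_dataset_size
    return DatasetConfigurationSketch(
        dataset_names=["airt_harms"],
        max_dataset_size=size,
    )
```

With this shape, integration tests that pass nothing keep the small default, while a caller who sets a limit explicitly overrides it.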
| all_templates = TextJailBreak.get_jailbreak_templates() | ||
|
|
||
| if jailbreak_names: | ||
| diff = set(jailbreak_names) - set(all_templates) |
curiosity: whoa my brain doesn't compute this logic lol
is the diff = the names that are in jailbreak_names and not in all_templates?
could we make the same comparison by checking each name in jailbreak_names and raising an error if it's not in set(all_templates), and this is just a more efficient way of doing that?
You computed it correctly 🙂 but on a second look it really wasn't readable, so I added a comment that explains how it works. The comparison is the same as the one you described, but more efficient.
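To make the equivalence discussed in this thread concrete, here is a minimal sketch of both forms. The function names and the sample template names in the usage note are hypothetical and not taken from the PR; only the expression `set(jailbreak_names) - set(all_templates)` comes from the diff.

```python
from typing import Iterable, Set


def unknown_names_set(jailbreak_names: Iterable[str], all_templates: Iterable[str]) -> Set[str]:
    # Set difference: every requested name that is NOT among the known templates.
    return set(jailbreak_names) - set(all_templates)


def unknown_names_loop(jailbreak_names: Iterable[str], all_templates: Iterable[str]) -> Set[str]:
    # Same check written as an explicit membership test per requested name.
    known = set(all_templates)
    return {name for name in jailbreak_names if name not in known}
```

Both functions return the same set, so a caller can raise an error whenever the result is non-empty, for example when `["dan.yaml", "typo.yaml"]` is requested but only `dan.yaml` exists.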
Description
Adding more features to the Jailbreak scenario! Major changes:
- JailbreakStrategy now supports multiple different attack types via ManyShot, PromptSending, Crescendo, and RedTeaming values.
- SINGLE_TURN and MULTI_TURN aggregates; PYRIT has been deprecated.
- k_jailbreaks, num_tries, and jailbreak_names; these allow you to choose a random number of jailbreaks, how many times to try each jailbreak, and which jailbreaks specifically you'd like to use, respectively. Note that k_jailbreaks and jailbreak_names are mutually exclusive.
Tests and Documentation
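The parameter rules in the description above (k_jailbreaks picks a random subset, jailbreak_names picks specific templates, the two are mutually exclusive, and num_tries repeats each jailbreak) can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: `select_jailbreaks`, `expand_attempts`, and the `rng` argument are hypothetical names introduced here.

```python
import random
from typing import List, Optional, Sequence, Tuple


def select_jailbreaks(
    all_templates: Sequence[str],
    k_jailbreaks: Optional[int] = None,
    jailbreak_names: Optional[Sequence[str]] = None,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Hypothetical helper: choose templates by name or at random, never both."""
    if k_jailbreaks is not None and jailbreak_names:
        raise ValueError("k_jailbreaks and jailbreak_names are mutually exclusive")

    if jailbreak_names:
        # Names requested but not present among the known templates.
        unknown = set(jailbreak_names) - set(all_templates)
        if unknown:
            raise ValueError(f"Unknown jailbreak templates: {sorted(unknown)}")
        return list(jailbreak_names)

    if k_jailbreaks is not None:
        if not 0 < k_jailbreaks <= len(all_templates):
            raise ValueError("k_jailbreaks must be between 1 and the number of templates")
        chooser = rng if rng is not None else random
        # Sample without replacement so no template is picked twice.
        return chooser.sample(list(all_templates), k_jailbreaks)

    # Neither option given: use every template.
    return list(all_templates)


def expand_attempts(selected: Sequence[str], num_tries: int) -> List[Tuple[str, int]]:
    """Hypothetical helper: pair each selected template with an attempt index."""
    return [(name, attempt) for name in selected for attempt in range(num_tries)]
```

Passing a seeded `random.Random` as `rng` keeps the random selection reproducible in tests.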